Implement a new --failing-and-slow-first command line argument to test runner. #24624


Open: juj wants to merge 13 commits into main

Conversation

@juj (Collaborator) commented Jun 26, 2025

This keeps track of the results of the previous test run, and on subsequent runs, failing tests are run first, then skipped tests, and last, previously passing tests in slowest-first order. This improves the parallel throughput of the suite.
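A minimal sketch of the ordering idea (the PR persists results to out/__previous_test_run_results.json, but the record layout and the placement of never-seen tests here are assumptions):

def test_sort_key(test_name, previous_results):
  record = previous_results.get(test_name)
  if record is None:
    return (0, 0)  # never-seen tests run early: nothing is known about them
  if record['result'] in ('failed', 'errored'):
    return (0, -record['duration'])  # previously failing tests go first
  if record['result'] == 'skipped':
    return (1, 0)  # then previously skipped tests
  return (2, -record['duration'])  # then passing tests, slowest first

previous = {'test_a': {'result': 'success', 'duration': 2.0},
            'test_b': {'result': 'failed', 'duration': 0.1},
            'test_c': {'result': 'success', 'duration': 30.0},
            'test_d': {'result': 'skipped', 'duration': 0.0}}
tests = ['test_a', 'test_b', 'test_c', 'test_d', 'test_new']
print(sorted(tests, key=lambda t: test_sort_key(t, previous)))
# -> ['test_b', 'test_new', 'test_d', 'test_c', 'test_a']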

Also adds support for --failfast in the multithreaded test suite, so suite runs stop quickly at the first test failure.
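For illustration, one way fail-fast can be layered on a process pool; this is a hedged sketch, not the PR's implementation (run_one and the test names are placeholders):

import multiprocessing

def run_one(test_name):
  # Placeholder for running a single test; returns (name, passed).
  return (test_name, not test_name.startswith('test_bad'))

def run_failfast(tests, jobs=4):
  with multiprocessing.Pool(jobs) as pool:
    # imap_unordered yields results as workers finish, so the parent can
    # stop handing out further work as soon as the first failure arrives.
    for name, passed in pool.imap_unordered(run_one, tests):
      if not passed:
        pool.terminate()  # fail fast: stop the remaining workers
        return name
  return None

if __name__ == '__main__':
  print(run_failfast(['test_a', 'test_bad_b', 'test_c'] * 5))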

Together, the two flags --failfast and --failing-and-slow-first can help achieve < 10 second test suite runs on CI when the suite is failing.

Example core0 runtime with test/runner core0 on a 16-core/32-thread system:

Total core time: 2818.016s. Wallclock time: 118.083s. Parallelization: 23.86x.

Same suite runtime with test/runner --failing-and-slow-first core0:

Total core time: 2940.180s. Wallclock time: 94.027s. Parallelization: 31.27x.

This yields better throughput and a 20.37% reduction in test suite wall time.

juj added 6 commits June 26, 2025 19:33
…t runner. This keeps track of results of previous test run, and on subsequent runs, failing tests are run first, then skipped tests, and last, successful tests in slowest-first order. Add support for --failfast in the multithreaded test suite. This improves parallelism throughput of the suite, and helps stop at test failures quickly.
@sbc100 (Collaborator) left a comment:

IIUC this is what I currently use --failfast --continue for. The downside of --failfast --continue of course is that it doesn't work for parallel testing (so I also add -j1).

@sbc100 (Collaborator) left a comment:

Actually maybe I misunderstood. I use --failfast --continue when implementing new features and wanting to fix each test failure as I run into it.

How does this improve CI times on the bots? It seems like it would not affect the first run, but only subsequent runs, which the bots don't do, do they?

@juj (Collaborator, Author) commented Aug 13, 2025

> How does this improve CI times on the bots? It seems like it would not affect the first run, but only subsequent runs, which the bots don't do, do they?

It doesn't work on the current CircleCI bots, which always start from a clean slate and run all suites from a single command invocation, but it does help when a developer runs test suites locally, and on the ad hoc CI I am running at http://clbri.com:8010/ .

For example, here is one such run:

[image: screenshot of a test suite run on the ad hoc CI]

where all the failing suites fail within a few seconds, rather than taking a random length of time to fail.

Passing suites also run faster, since the shortest tests are run last, meaning core utilization stays at 100% throughout the test suite run. It is like a self-calibrating scheme that avoids having to name slow tests test_zzz_ (a convention that is itself detrimental to test speed).
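As a toy illustration (not code from this PR) of why slowest-first ordering shortens wall time, consider a greedy scheduler that always hands the next test to the least-loaded worker:

import heapq

def wallclock(durations, workers=2):
  # Greedy scheduling: each test goes to the currently least-loaded worker.
  loads = [0.0] * workers
  heapq.heapify(loads)
  for d in durations:
    heapq.heappush(loads, heapq.heappop(loads) + d)
  return max(loads)

tests = [10] + [1] * 10  # one slow test, ten fast ones: 20s of total work
print(wallclock(sorted(tests)))                # slow test last:  15.0s wall time
print(wallclock(sorted(tests, reverse=True)))  # slow test first: 10.0s wall time

With 20s of work on 2 workers the lower bound is 10s of wall time; slowest-first reaches it, while slowest-last leaves one core idle for the final 10 seconds.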

@juj (Collaborator, Author) commented Aug 15, 2025

It would be great to get this landed, since this would enable my CI to run against the upstream tree more easily.

@sbc100 (Collaborator) commented Aug 15, 2025

> It would be great to get this landed, since this would enable my CI to run against the upstream tree more easily.

It seems like there are a couple of different things intertwined here.

Perhaps we can tease some of it apart and try to simplify.

The first thing here is making --failfast work in the parallel runner. That seems like a great idea and maybe we can land it separately.

Regarding running slow tests first, how about just making that the default? For the initial run we could check in a copy of the test times and update it every few months (see the sketch after this comment). Then we wouldn't need a new flag.

Perhaps --failing-first could be a new flag, but again it might make sense to just make it the default?
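A hedged sketch of that checked-in baseline idea (the path test/test_times_baseline.json is a made-up name): prefer the local previous-run results, and fall back to a committed baseline on a fresh checkout:

import json

def load_test_times():
  # Local results from the previous run win; a checked-in baseline
  # (hypothetical path) seeds the ordering on a clean checkout.
  for path in ('out/__previous_test_run_results.json',
               'test/test_times_baseline.json'):
    try:
      with open(path) as f:
        return json.load(f)
    except FileNotFoundError:
      continue
  return {}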

@sbc100 (Collaborator) commented Aug 15, 2025

Why is this change important for your CI?


def addError(self, test, err):
  print(test, '... ERROR', file=sys.stderr)
  self.buffered_result = BufferedTestError(test, err)
  self.test_result = 'errored'
sbc100 (Collaborator):

Is this needed? Isn't the existing buffered_result object enough?

juj (Collaborator, Author):

Python doesn't have an API to ask a test object for its result, and it would have required some kind of awkward isinstance() jungle to convert the result to a string, so I opted to write simple-looking code as the preferable way.
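For contrast, a sketch of what recovering a single test's outcome from a stock unittest.TestResult could look like, by searching its result lists (hypothetical helper, not code from this PR):

import unittest

def result_of(result: unittest.TestResult, test: unittest.TestCase) -> str:
  # Outcomes are scattered across the result object's lists of
  # (test, info) tuples, so each list has to be searched in turn.
  if any(t is test for t, _ in result.errors):
    return 'errored'
  if any(t is test for t, _ in result.failures):
    return 'failed'
  if any(t is test for t, _ in result.skipped):
    return 'skipped'
  return 'passed'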

try:
  previous_test_run_results = json.load(open('out/__previous_test_run_results.json'))
except FileNotFoundError:
  previous_test_run_results = {}
sbc100 (Collaborator):

Do we need to duplicate this code for handling __previous_test_run_results between here and runner.py?

Perhaps a shared sorting function that they can both call?

juj (Collaborator, Author):

This code is not creating a sorter, but I see that the shared block amounts to:

def load_previous_test_run_results():
  try:
    return json.load(open('out/__previous_test_run_results.json'))
  except FileNotFoundError:
    return {}

I could refactor that into e.g. common.py, though that is not necessarily a large win.

sbc100 (Collaborator):

I guess I don't quite understand what is going on here then... I would have thought the new --failing-and-slow-first flag would apply equally to the non-parallel and parallel test runners, and so it would only need to be handled in a single place (where we decide on the test ordering)... I need to take a deeper look at what is really going on here.

juj (Collaborator, Author):

Merged the code.

juj (Collaborator, Author):

The new flag only applies to the parallel test runner. If we wanted to extend this to the non-parallel runner, that's fine, though it could be worked on in a later PR as well.

@juj (Collaborator, Author) commented Aug 15, 2025

> Why is this change important for your CI?

It allows me to run all the test suites in a more reasonable time while iterating on the different failures. Otherwise the failing suites take 30-50x longer to surface their failures, making it time consuming to see what is still failing.
